RAW SEQUENCE LISTING DATE: 04/08/2002

PATENT APPLICATION: US/10/026,140 TIME: 09:42:41

Input Set : A:\GC697-SEQLIST.TXT

Output Set: N:\CRF3\04082002\J026140.raw

<110> APPLICANT: Dunn-Coleman, Nigel
      Goedegebuur, Frits
      Ward, Michael
      Yao, Jian

<120> TITLE OF INVENTION: BGL5 Beta-Glucosidase and Nucleic Acids
      Encoding the Same

<130> FILE REFERENCE: GC697

<140> CURRENT APPLICATION NUMBER: US/10/026,140

<141> CURRENT FILING DATE:

<160> NUMBER OF SEQ ID NOS: 3

<170> SOFTWARE: FastSEQ for Windows Version 4.0

<210> SEQ ID NO: 1

<211> LENGTH: 1991

<212> TYPE: DNA

<213> ORGANISM: Trichoderma reesei

<400> SEQUENCE: 1

agccaggtcg caaccagcag cagcagcagc agtacagaga aatcaaccca gatagctcaa 60

aatgcccgag tcgctagctc tgcccaacga ctttgaatgg ggcttcgcaa cggccgccta 120

ccagatcgaa ggcgccgtca aagaaggtgg ccgcggcccg tccatctggg acacgtactg 180

ccacctggag ccatcgcgca ccaacggcgc caacggcgat gtggcttgcg atcactacca 240

ccgctacgat gaggactttg atctcttgac caagtacggc gcaaaggcct accgcttctc 300

cttgtcgtgg tcgcggatca ttcccctcgg cggcaggctg gatcccgtca acgaggaggg 360

aattgagttt tacagcaaac tgattgacgc cctgttgagg cggggtatca cgccttgggt 420

gactttgtac cactgggatc tgcctcaggc gcttcacgat cgctatggag gctggctcaa 480

cgtggaagag gtccagctgg actttgagcg gtatgcgagg ttgtgctttg aacgttttgg 540

ggaccgagtc cagaactgga tcaccatcaa cgawccctgg attcaggcca tctatggata 600

tgccaccggc agcaacgccc cgggcaggag cagcattaac aagcactcca ccgagggcaa 660

cactgccact gagccgtggc tcgctggaaa ggcccagatc atgagccatg cccgcgccgt 720

ggccgtctac agcagggact ttcgcccctc gcaaaagggc cagatcggca tctcgctcaa 780

cggcgactac tatgagccct gggacagcaa tgagcctcgg gacaaggagg ctgctgagcg 840

acggatggaa tttcacattg gctggtttgc caatcccatc ttcttgaaga aggactatcc 900

agagagcatg aagaagcagc tgggcgagag gcttccagcc ctcactcccg cggactttgc 960

catcctcaat gccggagaga ccgacttcta cggcatgaat tactacacat cccagttcgc 1020

gcgccaccta gacggtcccg tccccgagac ggactatctc ggcgccatcc atgagcacca 1080

ggagaataag gacggcagcc ccgttggcga ggagagcggc ctcgcctggc tgcgctcctg 1140

cccggacatg ttccggaagc atctcgcccg ggtgtacggc ctgtacggca agcccatcta 1200

catcaccgag aacggatgcc cgtgccctgg agaggagaac atgacgtgcg aggaggccgt 1260

caacgacccc ttccgcatcc gstactttga ctcgcacttg gactcgattt ccaaggccat 1320

tacccaggac ggcgtcgtcg tcaaggggta ctttgcgtgg gcgttgctcg ataacttgga 1380

atggtcagat ggctacggac ccagattcgg cgtcacgttc acagactaca ccaccctcaa 1440

gcgcacgccc aagaagtctg ccctggtcct caaggacatg tttgcggccc ggcagagggt 1500

taaagtggcg gcataaagaa agggaaattt cttcttgcat tcagcctcta tgcatcttcc 1560

tctctctttt tccctccctc cccttgtccc tctctctcta cctctcatat tccctctata 1620

54 ccccccgctt cttctcatga ccccatgctc cttgcccttg gcccctctct gtcgaattct 1680 

55 gcctcttatc acgtcttatg cgtctgttta cttgcctttt ttttttttgt ctctttctgt 1740 

56 ctgtctgtct gcctgtctat gtgtacctat ctggcccttc gctcattggc aacagatact 1800 

57 agcacaagtt caagcaagca agcacgcaag caagcaagca agccagccat caacggcatc 1860 

58 aaagccccat gtttagcctc atgttcacat tgctatgtta tctacatcag ccattcacta 1920 

59 ccaggcgaag aggccacaga gagtctcatc gtcttacctg tatatacgct tttttaaaaa 1980 

60 aaaaaaaaaa a 1991 

62 <210> SEQ ID NO: 2 

63 <211> LENGTH: 484 

64 <212> TYPE: PRT 

65 <213> ORGANISM: Trichoderma reesei 

67 <220> FEATURE: 

68 <221> NAME/KEY: VARIANT 

69 <222> LOCATION: (1)...(484) 

70 <223> OTHER INFORMATION: Xaa = Any Amino Acid 

72 <400> SEQUENCE: 2 

73 Met Pro Glu Ser Leu Ala Leu Pro Asn Asp Phe Glu Trp Gly Phe Ala 

74 1 5 10 15 

75 Thr Ala Ala Tyr Gln Ile Glu Gly Ala Val Lys Glu Gly Gly Arg Gly 

76 20 25 30 

77 Pro Ser Ile Trp Asp Thr Tyr Cys His Leu Glu Pro Ser Arg Thr Asn 

78 35 40 45 

79 Gly Ala Asn Gly Asp Val Ala Cys Asp His Tyr His Arg Tyr Asp Glu 

80 50 55 60 

81 Asp Phe Asp Leu Leu Thr Lys Tyr Gly Ala Lys Ala Tyr Arg Phe Ser 

82 65 70 75 80 

83 Leu Ser Trp Ser Arg Ile Ile Pro Leu Gly Gly Arg Leu Asp Pro Val 

84 85 90 95 

85 Asn Glu Glu Gly Ile Glu Phe Tyr Ser Lys Leu Ile Asp Ala Leu Leu 

86 100 105 110 

87 Arg Arg Gly Ile Thr Pro Trp Val Thr Leu Tyr His Trp Asp Leu Pro 

88 115 120 125 

89 Gln Ala Leu His Asp Arg Tyr Gly Gly Trp Leu Asn Val Glu Glu Val 

90 130 135 140 

91 Gln Leu Asp Phe Glu Arg Tyr Ala Arg Leu Cys Phe Glu Arg Phe Gly 

92 145 150 155 160 

93 Asp Arg Val Gln Asn Trp Ile Thr Ile Asn Xaa Pro Trp Ile Gln Ala 

94 165 170 175 

95 Ile Tyr Gly Tyr Ala Thr Gly Ser Asn Ala Pro Gly Arg Ser Ser Ile 

96 180 185 190 

97 Asn Lys His Ser Thr Glu Gly Asn Thr Ala Thr Glu Pro Trp Leu Ala 

98 195 200 205 

99 Gly Lys Ala Gln Ile Met Ser His Ala Arg Ala Val Ala Val Tyr Ser 

100 210 215 220 

101 Arg Asp Phe Arg Pro Ser Gln Lys Gly Gln Ile Gly Ile Ser Leu Asn 

102 225 230 235 240 

103 Gly Asp Tyr Tyr Glu Pro Trp Asp Ser Asn Glu Pro Arg Asp Lys Glu 

104 245 250 255 

105 Ala Ala Glu Arg Arg Met Glu Phe His Ile Gly Trp Phe Ala Asn Pro 

106 260 265 270

107 Ile Phe Leu Lys Lys Asp Tyr Pro Glu Ser Met Lys Lys Gln Leu Gly

108 275 280 285

109 Glu Arg Leu Pro Ala Leu Thr Pro Ala Asp Phe Ala Ile Leu Asn Ala

110 290 295 300

111 Gly Glu Thr Asp Phe Tyr Gly Met Asn Tyr Tyr Thr Ser Gln Phe Ala

112 305 310 315 320

113 Arg His Leu Asp Gly Pro Val Pro Glu Thr Asp Tyr Leu Gly Ala Ile

114 325 330 335

115 His Glu His Gln Glu Asn Lys Asp Gly Ser Pro Val Gly Glu Glu Ser

116 340 345 350

117 Gly Leu Ala Trp Leu Arg Ser Cys Pro Asp Met Phe Arg Lys His Leu

118 355 360 365

119 Ala Arg Val Tyr Gly Leu Tyr Gly Lys Pro Ile Tyr Ile Thr Glu Asn

120 370 375 380

121 Gly Cys Pro Cys Pro Gly Glu Glu Asn Met Thr Cys Glu Glu Ala Val

122 385 390 395 400

123 Asn Asp Pro Phe Arg Ile Arg Tyr Phe Asp Ser His Leu Asp Ser Ile

124 405 410 415

125 Ser Lys Ala Ile Thr Gln Asp Gly Val Val Val Lys Gly Tyr Phe Ala

126 420 425 430

127 Trp Ala Leu Leu Asp Asn Leu Glu Trp Ser Asp Gly Tyr Gly Pro Arg

128 435 440 445

129 Phe Gly Val Thr Phe Thr Asp Tyr Thr Thr Leu Lys Arg Thr Pro Lys

130 450 455 460

131 Lys Ser Ala Leu Val Leu Lys Asp Met Phe Ala Ala Arg Gln Arg Val

132 465 470 475 480

133 Lys Val Ala Ala

136 <210> SEQ ID NO: 3 

137 <211> LENGTH: 1455 

138 <212> TYPE: DNA 

139 <213> ORGANISM: Trichoderma reesei 

141 <400> SEQUENCE: 3 

142 atgcccgagt cgctagctct gcccaacgac tttgaatggg gcttcgcaac ggccgcctac 60 

143 cagatcgaag gcgccgtcaa agaaggtggc cgcggcccgt ccatctggga cacgtactgc 120 

144 cacctggagc catcgcgcac caacggcgcc aacggcgatg tggcttgcga tcactaccac 180 

145 cgctacgatg aggactttga tctcttgacc aagtacggcg caaaggccta ccgcttctcc 240 

146 ttgtcgtggt cgcggatcat tcccctcggc ggcaggctgg atcccgtcaa cgaggaggga 300 

147 attgagtttt acagcaaact gattgacgcc ctgttgaggc ggggtatcac gccttgggtg 360 

148 actttgtacc actgggatct gcctcaggcg cttcacgatc gctatggagg ctggctcaac 420 

149 gtggaagagg tccagctgga ctttgagcgg tatgcgaggt tgtgctttga acgttttggg 480 

150 gaccgagtcc agaactggat caccatcaac gawccctgga ttcaggccat ctatggatat 540 

151 gccaccggca gcaacgcccc gggcaggagc agcattaaca agcactccac cgagggcaac 600 

152 actgccactg agccgtggct cgctggaaag gcccagatca tgagccatgc ccgcgccgtg 660 

153 gccgtctaca gcagggactt tcgcccctcg caaaagggcc agatcggcat ctcgctcaac 720 

154 ggcgactact atgagccctg ggacagcaat gagcctcggg acaaggaggc tgctgagcga 780 

155 cggatggaat ttcacattgg ctggtttgcc aatcccatct tcttgaagaa ggactatcca 840 

156 gagagcatga agaagcagct gggcgagagg cttccagccc tcactcccgc ggactttgcc 900 

157 atcctcaatg ccggagagac cgacttctac ggcatgaatt actacacatc ccagttcgcg 960 

158 cgccacctag acggtcccgt ccccgagacg gactatctcg gcgccatcca tgagcaccag 1020 

159 gagaataagg acggcagccc cgttggcgag gagagcggcc tcgcctggct gcgctcctgc 1080 

160 ccggacatgt tccggaagca tctcgcccgg gtgtacggcc tgtacggcaa gcccatctac 1140 

161 atcaccgaga acggatgccc gtgccctgga gaggagaaca tgacgtgcga ggaggccgtc 1200 

162 aacgacccct tccgcatccg stactttgac tcgcacttgg actcgatttc caaggccatt 1260 

163 acccaggacg gcgtcgtcgt caaggggtac tttgcgtggg cgttgctcga taacttggaa 1320 

164 tggtcagatg gctacggacc cagattcggc gtcacgttca cagactacac caccctcaag 1380 

165 cgcacgccca agaagtctgc cctggtcctc aaggacatgt ttgcggcccg gcagagggtt 1440 

166 aaagtggcgg cataa 1455 
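
The declared lengths above can be cross-checked with a little arithmetic: a 1455-nucleotide coding sequence is 485 codons, which would correspond to the 484 residues of SEQ ID NO: 2 plus one stop codon. The Python sketch below is illustrative only and is not part of the filed listing; the function names and the deliberately partial codon table (covering just the codons in the last row of SEQ ID NO: 3) are assumptions made for this example.

```python
# Illustrative consistency check; all names here are hypothetical.
CDS_LENGTH = 1455      # <211> LENGTH of SEQ ID NO: 3
PROTEIN_LENGTH = 484   # <211> LENGTH of SEQ ID NO: 2

# Partial codon table: only the codons found in the final row of SEQ ID NO: 3.
PARTIAL_CODON_TABLE = {
    "aaa": "Lys",
    "gtg": "Val",
    "gcg": "Ala",
    "gca": "Ala",
    "taa": "Ter",  # stop codon
}


def codon_count(nt_length):
    """Return the number of codons in a sequence of nt_length nucleotides."""
    if nt_length % 3 != 0:
        raise ValueError("length is not a whole number of codons")
    return nt_length // 3


def translate(seq):
    """Translate an in-frame sequence using the partial table above."""
    seq = seq.replace(" ", "").lower()
    return [PARTIAL_CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3)]


# 485 codons should equal 484 residues plus one stop codon.
lengths_agree = codon_count(CDS_LENGTH) == PROTEIN_LENGTH + 1

# Final row of SEQ ID NO: 3 (positions 1441-1455), which starts in frame.
last_row = translate("aaagtggcgg cataa")
```

Under these assumptions, `last_row` ends with the stop codon, and the four residues before it can be compared against the final residues listed for SEQ ID NO: 2.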
VERIFICATION SUMMARY DATE: 04/08/2002 

PATENT APPLICATION: US/10/026,140 TIME: 09:42:43 

Input Set : A:\GC697-SEQLIST.TXT 

Output Set: N:\CRF3\04082002\J026140.raw 

L:15 M:271 C: Current Filing Date differs. Replaced Current Filing Date 
L:93 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:2 






